
    Learning to Learn to Disambiguate: Meta-Learning for Few-Shot Word Sense Disambiguation

    The success of deep learning methods hinges on the availability of large training datasets annotated for the task of interest. In contrast to human intelligence, these methods lack versatility and struggle to learn and adapt quickly to new tasks where labeled data is scarce. Meta-learning aims to solve this problem by training a model on a large number of few-shot tasks, with the objective of learning new tasks quickly from a small number of examples. In this paper, we propose a meta-learning framework for few-shot word sense disambiguation (WSD), where the goal is to learn to disambiguate unseen words from only a few labeled instances. Meta-learning approaches have so far typically been tested in an N-way, K-shot classification setting, where each task has N classes with K examples per class. Owing to its nature, WSD deviates from this controlled setup and requires models to handle a large number of highly unbalanced classes. We extend several popular meta-learning approaches to this scenario, and analyze their strengths and weaknesses in this challenging new setting.
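As a concrete illustration of the episodic setup described above, here is a minimal sketch (in PyTorch, with illustrative names and dimensions, not the paper's actual code) of one few-shot WSD episode using a prototypical-network-style classifier, which copes naturally with unbalanced sense inventories:

```python
# Minimal sketch of one few-shot WSD episode with a prototypical-network-style
# classifier. Embeddings, sense labels, and dimensions are illustrative.
import torch

def prototypical_episode(support_emb, support_labels, query_emb):
    """Classify query contexts by distance to per-sense prototypes.

    support_emb: (S, d) contextual embeddings of labeled support instances
    support_labels: (S,) integer sense ids (classes may be unbalanced)
    query_emb: (Q, d) embeddings of instances to disambiguate
    """
    senses = support_labels.unique()
    # Prototype = mean embedding of each sense's support instances.
    prototypes = torch.stack(
        [support_emb[support_labels == s].mean(dim=0) for s in senses]
    )
    # Squared Euclidean distance to each prototype; nearest prototype wins.
    dists = torch.cdist(query_emb, prototypes) ** 2
    return senses[dists.argmin(dim=1)]

# Toy usage: 5 support instances over 2 senses (unbalanced 3 vs. 2), 3 queries.
emb = torch.randn(5, 16)
labels = torch.tensor([0, 0, 0, 1, 1])
queries = torch.randn(3, 16)
print(prototypical_episode(emb, labels, queries))
```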

    Modeling brain activity associated with metaphor processing with distributional semantic models

    In this study we investigate how lexical-semantic relations associated with the literal meaning (and the abstract meaning) are accessed across the brain during familiar metaphor comprehension. We utilize a data-driven whole-brain searchlight similarity-decoding analysis. We contrast decoding metaphoric phrases ("she's grasping the idea") using distributional semantic models of the verb in the phrase (VERB model) versus those of the more abstract verb sense (PARAPHRASE VERB model) obtained from literal paraphrases of the metaphoric phrases ("she's understanding the idea"). We showed successful decoding with the VERB model across frontal, temporal and parietal lobes, mainly within areas of the language and default-mode networks. In contrast, decoding with the PARAPHRASE VERB model was restricted to the frontal and temporal lobes, within areas of the language network which overlapped to some extent with significant decoding under the VERB model. Overall, the results suggest that lexical-semantic relations closely associated with the abstract meaning in metaphor processing are largely localized to the language and amodal (multimodal) semantic memory systems of the brain, while those more associated with the literal meaning are processed across a distributed semantic network, including areas implicated in mental imagery and social cognition.
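For readers unfamiliar with similarity-based decoding, here is a hedged sketch of the pairwise (2-vs-2) decoding test that analyses of this kind build on; the per-searchlight regression from semantic vectors to voxel patterns is omitted, and all data below are synthetic:

```python
# 2-vs-2 similarity decoding: for each pair of items, check whether the
# correct pairing of predicted and observed brain patterns is more similar
# than the swapped pairing. Real searchlight analyses run this per sphere.
import numpy as np
from itertools import combinations

def two_vs_two_accuracy(predicted, observed):
    """predicted/observed: (n_items, n_voxels) arrays of activity patterns."""
    def corr(a, b):
        return np.corrcoef(a, b)[0, 1]
    hits, total = 0, 0
    for i, j in combinations(range(len(predicted)), 2):
        correct = corr(predicted[i], observed[i]) + corr(predicted[j], observed[j])
        swapped = corr(predicted[i], observed[j]) + corr(predicted[j], observed[i])
        hits += correct > swapped
        total += 1
    return hits / total

rng = np.random.default_rng(0)
obs = rng.normal(size=(10, 50))                      # synthetic voxel patterns
pred = obs + rng.normal(scale=2.0, size=obs.shape)   # noisy model predictions
print(two_vs_two_accuracy(pred, obs))                # > 0.5 means decodable
```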

    Examining Modularity in Multilingual LMs via Language-Specialized Subnetworks

    Recent work has proposed explicitly inducing language-wise modularity in multilingual LMs via sparse fine-tuning (SFT) on per-language subnetworks as a means of better guiding cross-lingual sharing. In this work, we investigate (1) the degree to which language-wise modularity naturally arises within models with no special modularity interventions, and (2) how cross-lingual sharing and interference differ between such models and those with explicit SFT-guided subnetwork modularity. To quantify language specialization and cross-lingual interaction, we use a Training Data Attribution method that estimates the degree to which a model's predictions are influenced by in-language or cross-language training examples. Our results show that language-specialized subnetworks do naturally arise, and that SFT, rather than always increasing modularity, can decrease language specialization of subnetworks in favor of more cross-lingual sharing.
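A minimal sketch of one simple Training Data Attribution score, a TracIn-style gradient similarity (the paper's exact TDA method may differ), which estimates how much a single training example pushes the model toward or away from a given test prediction:

```python
# Influence of training example z on test example z' approximated by the
# dot product of their loss gradients: positive suggests helpful sharing,
# negative suggests interference (e.g., across languages).
import torch

def grad_vector(model, loss_fn, x, y):
    loss = loss_fn(model(x), y)
    grads = torch.autograd.grad(loss, [p for p in model.parameters() if p.requires_grad])
    return torch.cat([g.flatten() for g in grads])

def influence(model, loss_fn, train_xy, test_xy):
    g_train = grad_vector(model, loss_fn, *train_xy)
    g_test = grad_vector(model, loss_fn, *test_xy)
    return torch.dot(g_train, g_test).item()

# Toy usage with a linear classifier standing in for a multilingual LM head.
model = torch.nn.Linear(8, 3)
loss_fn = torch.nn.CrossEntropyLoss()
train_ex = (torch.randn(1, 8), torch.tensor([1]))   # an in- or cross-language example
test_ex = (torch.randn(1, 8), torch.tensor([2]))    # a prediction to attribute
print(influence(model, loss_fn, train_ex, test_ex))
```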

    Neural Character-based Composition Models for Abuse Detection

    The advent of social media in recent years has fed into some highly undesirable phenomena, such as the proliferation of offensive language, hate speech and sexist remarks on the Internet. In light of this, there have been several efforts to automate the detection and moderation of such abusive content. However, deliberate obfuscation of words by users to evade detection poses a serious challenge to the effectiveness of these efforts. The current state-of-the-art approaches to abusive language detection, based on recurrent neural networks, do not explicitly address this problem and resort to a generic OOV (out-of-vocabulary) embedding for unseen words. However, in using a single embedding for all unseen words we lose the ability to distinguish between obfuscated and non-obfuscated or rare words. In this paper, we address this problem by designing a model that can compose embeddings for unseen words. We experimentally demonstrate that our approach significantly advances the current state of the art in abuse detection on datasets from two different domains, namely Twitter and Wikipedia talk pages. Comment: In Proceedings of the EMNLP Workshop on Abusive Language Online, 2018.
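The core idea, composing an embedding for an unseen word from its characters rather than falling back on one shared OOV vector, can be sketched as follows (illustrative architecture and dimensions, not necessarily the paper's):

```python
# Compose embeddings for unseen (possibly obfuscated) words from characters,
# so "id.iot" and "idiot" get distinct but related vectors instead of one
# generic OOV embedding.
import torch
import torch.nn as nn

class CharComposer(nn.Module):
    def __init__(self, n_chars, char_dim=16, word_dim=64):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, char_dim)
        self.rnn = nn.LSTM(char_dim, word_dim // 2, bidirectional=True, batch_first=True)

    def forward(self, char_ids):          # (batch, max_word_len)
        h, _ = self.rnn(self.char_emb(char_ids))
        return h.mean(dim=1)              # (batch, word_dim) composed embedding

vocab = {c: i + 1 for i, c in enumerate("abcdefghijklmnopqrstuvwxyz.")}
def encode(w, maxlen=10):
    ids = [vocab.get(c, 0) for c in w][:maxlen]
    return torch.tensor(ids + [0] * (maxlen - len(ids)))

composer = CharComposer(n_chars=len(vocab) + 1)
batch = torch.stack([encode("idiot"), encode("id.iot")])
print(composer(batch).shape)  # torch.Size([2, 64])
```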

    MetaVR: Understanding metaphors in the mind and relation to emotion through immersive, spatial interaction

    Metaphorical thinking acts as a bridge between embodiment and abstraction and helps to flexibly organize human knowledge and behavior. Yet its role in embodied human-computer interface design, and its potential for supporting goals such as self-awareness and well-being, have not been extensively explored in the HCI community. We have designed a system called MetaVR to support the creation and exploration of immersive, multimodal, metaphoric experiences, in which people's bodily actions in the physical world are linked to metaphorically relevant actions in a virtual reality world. As a team of researchers in interaction, neuroscience, and linguistics, we have created MetaVR to support research exploring the impact of such metaphoric interactions on human emotion and well-being. We have used MetaVR to create a proof-of-concept interface for immersive, spatial interactions underpinned by the WELL-BEING IS VERTICALITY conceptual mapping: the known association of 'good'='up' and 'bad'='down'. Researchers and developers can currently interact with this proof of concept to configure various metaphoric interactions or personifications that have positive associations (e.g., 'being like a butterfly' or 'being like a flower') and also involve vertical motion (e.g., a butterfly might fly upwards, or a flower might bloom upwards). Importantly, the metaphoric interactions supported in MetaVR do not link human movement to VR actions in one-to-one ways, but rather use abstracted relational mappings in which events in VR (e.g., the blooming of a virtual flower) are contingent not merely on a "correct" gesture being performed, but on aspects of verticality exhibited in human movement (e.g., in a very simple case, the time a person's hands spend above some height threshold). This work thus serves as a small-scale vehicle for us to research how such interactions may impact well-being. Relatedly, it highlights the potential of using virtual embodied interaction as a tool to study cognitive processes involved in more deliberate/functional uses of metaphor and how this relates to emotion processing. By demonstrating MetaVR and the metaphoric interactions designed with it at CHI Interactivity, and by offering the MetaVR tool to other researchers, we hope to inspire new perspectives, discussion, and research within the HCI community about the role that such metaphoric interaction may play in interfaces designed for well-being and beyond.
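A toy sketch of the kind of abstracted verticality mapping described above, in which a virtual event's progress depends on how long the hands stay above a height threshold rather than on a specific gesture (names and thresholds are illustrative, not MetaVR's actual API):

```python
# Abstracted relational mapping: bloom progress of a virtual flower is
# contingent on cumulative time the hands spend above a height threshold.
def verticality_progress(hand_heights, threshold=1.4, dt=1 / 60, full_at=3.0):
    """hand_heights: per-frame hand height samples in meters (60 Hz assumed).
    Returns progress in [0, 1]: fraction of `full_at` seconds spent above
    `threshold`."""
    time_above = sum(dt for h in hand_heights if h > threshold)
    return min(time_above / full_at, 1.0)

# 4 seconds of motion: hands raised for the middle 2 seconds.
frames = [1.0] * 60 + [1.6] * 120 + [1.0] * 60
print(verticality_progress(frames))  # ~0.67: the virtual flower is mostly bloomed
```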

    Scientific and Creative Analogies in Pretrained Language Models

    This paper examines the encoding of analogy in large-scale pretrained language models, such as BERT and GPT-2. Existing analogy datasets typically focus on a limited set of analogical relations, with high similarity between the two domains across which the analogy holds. As a more realistic setup, we introduce the Scientific and Creative Analogy dataset (SCAN), a novel analogy dataset containing systematic mappings of multiple attributes and relational structures across dissimilar domains. Using this dataset, we test the analogical reasoning capabilities of several widely used pretrained language models (LMs). We find that state-of-the-art LMs achieve low performance on these complex analogy tasks, highlighting the challenges still posed by analogy understanding. Comment: To be published in Findings of EMNLP 2022.
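One common way to probe a pretrained LM for analogical knowledge is to score candidate target-domain completions by LM log-likelihood and pick the highest; a hedged sketch follows (the SCAN evaluation protocol may differ in its details):

```python
# Score analogy completions with GPT-2 log-likelihood. The example mapping
# (solar system -> atom) is in the style of cross-domain analogies like SCAN's.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tok = GPT2TokenizerFast.from_pretrained("gpt2")
lm = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def sequence_logprob(text):
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        out = lm(ids, labels=ids)          # loss = mean NLL over predicted tokens
    return -out.loss.item() * (ids.shape[1] - 1)   # total log-prob of the sequence

prompt = "If the sun is like the nucleus, then a planet is like"
candidates = [" an electron", " a proton", " a telescope"]
scores = {c: sequence_logprob(prompt + c) for c in candidates}
print(max(scores, key=scores.get))         # highest-likelihood completion
```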

    A Comparison of Architectures and Pretraining Methods for Contextualized Multilingual Word Embeddings

    The lack of annotated data in many languages is a well-known challenge within the field of multilingual natural language processing (NLP). Therefore, many recent studies focus on zero-shot transfer learning and joint training across languages to overcome data scarcity for low-resource languages. In this work we (i) perform a comprehensive comparison of state-of-the-art multilingual word and sentence encoders on the tasks of named entity recognition (NER) and part-of-speech (POS) tagging; and (ii) propose a new method for creating multilingual contextualized word embeddings, compare it to multiple baselines, and show that it performs at or above state-of-the-art level in zero-shot transfer settings. Finally, we show that our method allows for better knowledge sharing across languages in a joint training setting.
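Schematically, the zero-shot transfer protocol used in such comparisons trains a task head on source-language representations and evaluates it directly on a target language with no target-language training data. A sketch with placeholder features standing in for a frozen multilingual encoder (all data mocked, so accuracy is near chance):

```python
# Zero-shot cross-lingual transfer protocol: fit a tagging head on English
# token features, score on another language without any target training data.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
d, n_tags = 32, 5

def encode(sentences):
    # Placeholder for a multilingual encoder: one d-dim vector per token.
    return rng.normal(size=(sum(len(s) for s in sentences), d))

en_tokens = [["the", "cat", "sleeps"]] * 50
en_tags = rng.integers(0, n_tags, size=150)     # mocked gold POS tags
fi_tokens = [["kissa", "nukkuu"]] * 50          # unseen target language
fi_tags = rng.integers(0, n_tags, size=100)

head = LogisticRegression(max_iter=1000).fit(encode(en_tokens), en_tags)
zero_shot_acc = head.score(encode(fi_tokens), fi_tags)   # no Finnish training data
print(f"zero-shot POS accuracy: {zero_shot_acc:.2f}")
```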

    Joint Modelling of Emotion and Abusive Language Detection

    The rise of online communication platforms has been accompanied by some undesirable effects, such as the proliferation of aggressive and abusive behaviour online. Aiming to tackle this problem, the natural language processing (NLP) community has experimented with a range of techniques for abuse detection. While achieving substantial success, these methods have so far only focused on modelling the linguistic properties of the comments and the online communities of users, disregarding the emotional state of the users and how this might affect their language. The latter is, however, inextricably linked to abusive behaviour. In this paper, we present the first joint model of emotion and abusive language detection, experimenting in a multi-task learning framework that allows one task to inform the other. Our results demonstrate that incorporating affective features leads to significant improvements in abuse detection performance across datasets. Comment: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020.
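The multi-task setup described here can be sketched as a shared encoder with two task-specific heads trained jointly, so gradients from each task shape the shared representation (dimensions and heads are illustrative, not the paper's exact architecture):

```python
# Joint emotion + abuse model: one shared encoder, two classification heads,
# trained with the sum of the two task losses.
import torch
import torch.nn as nn

class JointEmotionAbuseModel(nn.Module):
    def __init__(self, in_dim=128, hidden=64, n_emotions=6):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.emotion_head = nn.Linear(hidden, n_emotions)
        self.abuse_head = nn.Linear(hidden, 2)   # abusive / not abusive

    def forward(self, x):
        h = self.encoder(x)                      # shared representation
        return self.emotion_head(h), self.abuse_head(h)

model = JointEmotionAbuseModel()
x = torch.randn(4, 128)                          # a batch of comment encodings
emo_logits, abuse_logits = model(x)
loss = (nn.functional.cross_entropy(emo_logits, torch.randint(0, 6, (4,)))
        + nn.functional.cross_entropy(abuse_logits, torch.randint(0, 2, (4,))))
loss.backward()                                  # one joint multi-task update
```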

    Multilingual and cross-lingual document classification: A meta-learning approach

    The great majority of languages in the world are considered under-resourced for the successful application of deep learning methods. In this work, we propose a meta-learning approach to document classification in a limited-resource setting and demonstrate its effectiveness in two different settings: few-shot, cross-lingual adaptation to previously unseen languages; and multilingual joint training when limited target-language data is available during training. We conduct a systematic comparison of several meta-learning methods, investigate multiple settings in terms of data availability, and show that meta-learning thrives in settings with a heterogeneous task distribution. We propose a simple yet effective adjustment to existing meta-learning methods which allows for better and more stable learning, and set a new state of the art on several languages while performing on par on others, using only a small amount of labeled data.
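For reference, a hedged sketch of one standard meta-learning method that such comparisons commonly include, a first-order Reptile-style outer update (the paper's specific adjustment is not reproduced here):

```python
# Reptile-style meta-step: adapt a copy of the model to one task, then move
# the meta-parameters a fraction of the way toward the adapted weights.
import copy
import torch

def reptile_step(model, task_batches, loss_fn, inner_lr=1e-2, meta_lr=0.1, inner_steps=5):
    adapted = copy.deepcopy(model)
    opt = torch.optim.SGD(adapted.parameters(), lr=inner_lr)
    for _, (x, y) in zip(range(inner_steps), task_batches):
        opt.zero_grad()
        loss_fn(adapted(x), y).backward()
        opt.step()
    with torch.no_grad():
        for p, p_adapted in zip(model.parameters(), adapted.parameters()):
            p += meta_lr * (p_adapted - p)   # interpolate toward the task solution

# Toy usage: one meta-step on a mock few-shot document-classification task.
model = torch.nn.Linear(16, 3)
task = [(torch.randn(8, 16), torch.randint(0, 3, (8,))) for _ in range(5)]
reptile_step(model, task, torch.nn.functional.cross_entropy)
```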